{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 第九章 优先级队列"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "队列遵循先进先出的原则，完全按照入队的顺序依次离队，然而在有些情况下，不能完全按照先进先出的原则，比如医院急诊、机票候补、滴滴打车算法等。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "优先级队列：有优先级的元素的集合，允许插入任意的元素、删除优先级最高的元素；优先级队列的键值对表示元素的优先级和元素，键的值越小优先级越高，因此允许删除键的值最小的键值对。键不一定是数字，可以是自定义大小比较准则的类。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "优先级队列ADT：\n",
    "\n",
    "1. P.add(k, v): 插入一个键值对(k, v)，键k表示优先级，值v表示元素。\n",
    "2. P.min(): 返回优先级队列P中优先级最高的键值对（以元组表示），但不移除，如果P为空则报错。\n",
    "3. P.remove_min(): 返回并移除优先级最高的键值对元组，如果P为空则报错。\n",
    "4. P.is_empty(): 如果优先级队列不包含任何元组，则返回True。\n",
    "5. len(p): 优先级队列中键值对元组的数量。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如果优先级最高的键值对有多对（即多个键相等），则返回（和删除）任意一个优先级最高的键值对元组。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在一般的优先级队列中，一个元素一旦加入优先级队列，键值对元组将保持不变，之后进行扩展，可以更新现有键值对的键。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 9.2 优先级队列的实现"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.2.1 组合设计模式"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "组合设计模式，编写一个嵌套类，嵌套类由两个及以上的实例变量用以存储，形成类似于元组、列表的结构。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "class PriorityQueueBase:\n",
    "    \n",
    "    class _Item:\n",
    "        \n",
    "        __slots__ = '_key', '_value'\n",
    "        \n",
    "        def __init__(self, key, value):\n",
    "            self._key = key\n",
    "            self._value = value\n",
    "        \n",
    "        def __It__(self, other):                           ## 定义组合类的大小，由键即优先级决定大小\n",
    "            return self._key < other._key\n",
    "    \n",
    "    def is_empty(self):\n",
    "        return len(self) == 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.2.2 使用未排序列表实现优先级队列"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "与`FavoritesList`类似，运用适配器模式，用`PositionalList`（本质是双向链表的封装）存储`Item`作为`self._data`。注：位置列表的所有操作都是O(1)，如果一个类的功能可以用位置列表来实现，那么使用位置列表往往是效率高的。（这里相当于用双向链表实现优先级队列）"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "未排序：每次添加元素，都创建一个新的`_Item`实例，添加到`self._data`的末端，因此`add`方法的时间复杂度为O(1)，同时由于`self._data`未排序，所以`min`和`remove_min`方法的时间复杂度为O(n)，由于`self._data`的`__len__`方法时间复杂度为O(1)，而`self._data`的长度即是优先级队列的长度，所以剩余的方法时间复杂度为O(1)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "操作 | 运行时间\n",
    ":-: | :-:\n",
    "len | O(1)\n",
    "is_empty | O(1)\n",
    "add | O(1)\n",
    "min | O(1)\n",
    "remove_min | O(1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "class UnsortedPriorityQueue(PriorityQueueBase):\n",
    "    \n",
    "    def __init__(self):\n",
    "        self._data = PositionalList()\n",
    "    \n",
    "    def _find_min(self):                                         ## 用于支持min和remove_min方法，返回优先级最高的元素的位置\n",
    "        if self.is_empty():\n",
    "            raise Empty('Priority queue is empty')\n",
    "        min_position = self._data.first()\n",
    "        walk = self._data.after(self._data.first())\n",
    "        while walk is not None:\n",
    "            if walk.element() < min_position.element():\n",
    "                min_position = walk\n",
    "            walk = self._data.after(walk)\n",
    "        return min_position\n",
    "    \n",
    "    def __len__(self):\n",
    "        return len(self._data)\n",
    "    \n",
    "    def add(self, key, value):\n",
    "        self._data.add_last(self._Item(key, value))\n",
    "    \n",
    "    def min(self):\n",
    "        min_element = self._find_min().element()\n",
    "        return min_element._key, min_element._value\n",
    "    \n",
    "    def remove_min(self):\n",
    "        min_position = self._find_min()\n",
    "        self._data.delete(min_position)\n",
    "        min_element = min_position\n",
    "        return min_element._key, min_element._value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.2.3 使用排序列表实现优先级队列"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "使用排序列表实现的优先级队列的add方法变慢了，min方法变快了。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "class SortedPriorityQueue(PriorityQueueBase):\n",
    "    \n",
    "    def __init__(self):\n",
    "        self._data = PositionalList()\n",
    "    \n",
    "    def __len__(self):\n",
    "        return len(self._data)\n",
    "    \n",
    "    def add(self, key, value):\n",
    "        new_Item = self._Item(key, value)\n",
    "        walk = self._data.last()\n",
    "        while walk is not None and walk.element() > new_Item:\n",
    "            walk = self._data.before(walk)\n",
    "        if walk is None:\n",
    "            self._data.add_first(new_Item)\n",
    "        else:\n",
    "            self._data.add_after(walk, new_Item)\n",
    "    \n",
    "    def min(self):\n",
    "        if self.is_empty():\n",
    "            raise Empty('Priority queue is empty')\n",
    "        min_element = self._data.first().element()\n",
    "        return min_element._key, min_element._value\n",
    "    \n",
    "    def remove_min(self):\n",
    "        if self.is_empty():\n",
    "            raise Empty('Priority queue is empty')\n",
    "        min_position = self._data.first()\n",
    "        min_element = min_position.element()\n",
    "        self._data.delete(min_position)\n",
    "        return min_element._key, min_element._value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "操作 | 未排序列表 | 排序列表\n",
    ":-: | :-: | :-: \n",
    "len | O(1) | O(1)\n",
    "is_empty | O(1) | O(1)\n",
    "add | O(1) | O(n)\n",
    "min | O(n) | O(1)\n",
    "remove_min | O(n) | O(1) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 9.3 堆"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "用未排序列表和排序列表实现优先级队列，需要在插入和取最有限元素的时间复杂度上做出权衡。使用`二进制堆`实现优先级队列可以以对数时间复杂度实现插入和删除操作，原理是使用二叉树的数据结构在元素的完全无序和完全有序中进行折中。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.3.1 堆的数据结构"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "堆是一棵二叉树，每个节点（封装成了位置）存储键值对元组（在优先级队列中存储元组），且满足子节点的键的值不小于父节点（即父节点的优先级总是更高，不过在一些其他的情况下可能会反过来）。堆的顶部是二叉树的根节点。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "堆的高度越小，效率是越高的。当保持堆为完全二叉树时，堆的高度是最小的。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "堆基于两个原则构建：\n",
    "\n",
    "1. 父节点的键和子节点的键的大小关系\n",
    "2. 二叉树的形状是完全二叉树还是不完全二叉树"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "完全二叉树堆：如果高度为h，那么第1层到第h-1层都应该有每一层最多的节点数（$2^i$），第h层的节点保持在左侧。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "基于完全二叉树堆的定义，堆的高度为不大于log(n)的最小整数。因为高度为h的完全二叉树所拥有的节点数是$2^h$到$2^{h+1}-1$。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.3.2 使用堆实现优先级队列"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "考虑用堆实现优先级队列：min方法很简单，只需访问根节点或者说堆的顶部，但是add和remove_min方法则有难度，需要结合堆的两个原则进行。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "add方法：\n",
    "\n",
    "1. 在堆中插入一个新节点（位置）存储键值对元组（组合设计模式），按照完全二叉树堆的定义，将新节点先放在最后一层的最右边，如果最后一层已满或者堆是空的，则放在新一层的最左边。这一步满足了`Complete Binary Tree Property`，但是可能违反`Heap-Order Property`。\n",
    "2. 新节点向上冒泡，新节点跟父节点比较，如果新节点优先级高，即键的值小（比父节点小，自然比兄弟节点小），则与父节点交换位置，然后继续向上冒泡直到不违反`Heap-Order Property`，最坏情况下时间复杂度为O(h)，相对于n为对数时间复杂度。\n",
    "\n",
    "新节点与父节点交换键值对元组的方式称为`堆向上冒泡（up-heap bubbling）`。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "remove_min方法：如果直接删除根节点，两棵子树将不连通，如果删除最后一层的最右边的节点，则不会违反完全二叉树堆的形状，因此将用最后一个节点的键值对替换根节点的键值对，然后删除最后一个节点，再给根节点处新的键值对寻找一个合适的位置。如果堆一开始为空，则报错，如果只有根节点，则直接删除，除这两种情况外，需要进行堆向下冒泡，每次取两个子节点中优先级更高的进行比较，如果父节点优先级高，则向下冒泡终止，如果子节点优先级高，则交换，重复以上过程直至终止。整个过程相对于节点数为对数时间复杂度。\n",
    "\n",
    "新的键值对寻找位置的过程称为`堆向下冒泡（down-heap bubbling）`。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "堆向上冒泡由于父节点总是唯一的，所以比较容易；向下冒泡总是需要跟优先级较高的子节点比较。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.3.3 基于数组的完全二叉树表示"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "基于数组的二叉树表示的一个缺点在于容易因为二叉树的形状而造成内存浪费，但是完全二叉树堆的形状刚好规避了这个缺点。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "使用基于数组表示的堆来实现优先级队列的优点：\n",
    "\n",
    "1. 内存浪费的缺点不复存在\n",
    "2. 比链式二叉树更方便地定位到最后一个节点（即最后一层的最右边的节点）"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "n个节点需要的数组长度为n。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在分析Python实现的时间复杂度时，要注意摊销，因为有`add`和`remove_min`操作。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.3.4 Python的堆实现"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Empty:\n",
    "    pass\n",
    "\n",
    "class HeapPriorityQueue(PriorityQueueBase):\n",
    "\n",
    "    def __init__(self):\n",
    "        self._data = []            ## 虽然是二叉树的结构，但是用数组存储，二叉树体现在父节点、子节点的索引上\n",
    "\n",
    "    def _parent(self, j):\n",
    "        return (j - 1) // 2\n",
    "\n",
    "    def _left(self, j):\n",
    "        return 2 * j + 1\n",
    "\n",
    "    def _right(self, j):\n",
    "        return 2 * j + 2\n",
    "\n",
    "    def _has_left(self, j):\n",
    "        return self._left(j) < len(self._data)\n",
    "\n",
    "    def _has_right(self, j):\n",
    "        return self._right(j) < len(self._data)\n",
    "\n",
    "    def _swap(self, i, j):\n",
    "        self._data[i], self._data[j] = self._data[j], self._data[j]\n",
    "\n",
    "    def _upheap(self, j):         ## 向上冒泡\n",
    "        if j > 0:\n",
    "            i = self._parent(j)\n",
    "            if self._data[j] < self._data[i]:\n",
    "                self._swap(i, j)\n",
    "                self._upheap(i)\n",
    "\n",
    "    def _downheap(self, j):\n",
    "        if self._has_left(j):     ## 如果没有子节点，递归就会停止了，这里已经暗含了条件\n",
    "            left = self._left(j)\n",
    "            small_child = left\n",
    "            if self._has_right(j):\n",
    "                right = self._right(j)\n",
    "                if self._data[right] < self._data[left]:\n",
    "                    small_child = right\n",
    "            if self._data[small_child] < self._data[j]:\n",
    "                self._swap(j, small_child)\n",
    "                self._downheap(small_child)\n",
    "\n",
    "    def __len__(self):\n",
    "        return len(self._data)\n",
    "\n",
    "    def add(self, key, value):\n",
    "        self._data.append(self._Item(key, value))\n",
    "        self._upheap(len(self._data) - 1)\n",
    "\n",
    "    def min(self):\n",
    "        if self.is_empty():\n",
    "            raise Empty('Priority queue is empty')\n",
    "        min_Item = self._data[0]\n",
    "        return min_Item._key, min_Item._value\n",
    "\n",
    "    def remove_min(self):\n",
    "        if self.is_empty():\n",
    "            raise Empty('Priority queue is empty')\n",
    "        min_Item = self._data[0]\n",
    "        self._data[0] = self._data[len(self._data) - 1]\n",
    "        del self._data[len(self._data) - 1]\n",
    "        self._downheap(0)\n",
    "        return min_Item._key, min_Item._value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.3.5 基于堆的优先级队列的分析"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "时间复杂度的分析在以下条件下进行：\n",
    "\n",
    "1. 堆在完全二叉树的限制下，高度h为不大于log(n)的整数\n",
    "2. 根部为优先级最高的键值对元组，min方法时间复杂度为O(1)\n",
    "3. add和remove_min方法需要从底向上冒泡或者从顶向下冒泡，时间复杂度由树的高度决定，为O(log(n))，在基于数组的堆中，找到最后一个节点是很容易的事情，时间复杂度O(1)，在基于链表的堆中，找到最后一个节点需要O(log(n))——>见习题。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "操作 | 运行时间\n",
    ":-: | :-: \n",
    "len(P), P.is_empty() | O(1)\n",
    "P.min() | O(1)\n",
    "P.add() | O(log(n))*\n",
    "P.remove_min() | O(log(n))*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "基于数组（list）的堆的add和remove_min方法需要摊销。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.3.6 自下而上构建堆"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "连续add很多对键值对元组时可以使用自下而上构建的堆的方式。假设堆的每一层都是满的，从最底层开始，用递归定义自下而上定义堆：\n",
    "\n",
    "1. 最底层，每个键值对元组为一个堆，每两个堆通过父节点连接为一个更大的堆，然后父节点向下冒泡。\n",
    "2. 持续进行第一步直到两个高度为h-1的堆连接为高度为h的堆。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在用数组表示的堆中实现自底向上的构建是很容易的，先将n个键值对元组存储在一个list中，在这个过程，堆的基本框架已经构建好了，只需要满足heap-order就好，方法是按照一定的顺序进行向下冒泡，先对第二层的所有键值对向下冒泡，然后第三层、第四层......"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "之前构建堆是先初始化一个空的堆，然后不断调用add方法，自下而上是直接初始化一个列表，然后冒泡使之符合heap-order。同样是建立一个n个节点的堆，前者时间复杂度为O(nlog(n))，后者为O(n)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "自下而上构建堆是针对初始化的，之后仍然支持add、remove_min等方法，改一改初始化的方法，代码如下："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Empty:\n",
    "    pass\n",
    "\n",
    "class HeapPriorityQueue(PriorityQueueBase):\n",
    "\n",
    "    def __init__(self, contents=()):\n",
    "        self._data = [self._Item(k, v) for k, v in contents]\n",
    "        if len(self._data) > 1:                                 ## 等于2的情况不会出现，因为假设每层都是满的\n",
    "            self._heapify()\n",
    "    \n",
    "    def _heapify(self):\n",
    "        start = self._parent(len(self) - 1)\n",
    "        for j in range(start, -1, -1):                          ## 因为每一层都是满的，兄弟节点都是相邻的，父节点在子节点之前，所以按顺序遍历\n",
    "            self._downheap(j)\n",
    "\n",
    "    def _parent(self, j):\n",
    "        return (j - 1) // 2\n",
    "\n",
    "    def _left(self, j):\n",
    "        return 2 * j + 1\n",
    "\n",
    "    def _right(self, j):\n",
    "        return 2 * j + 2\n",
    "\n",
    "    def _has_left(self, j):\n",
    "        return self._left(j) < len(self._data)\n",
    "\n",
    "    def _has_right(self, j):\n",
    "        return self._right(j) < len(self._data)\n",
    "\n",
    "    def _swap(self, i, j):\n",
    "        self._data[i], self._data[j] = self._data[j], self._data[j]\n",
    "\n",
    "    def _upheap(self, j):         ## 向上冒泡\n",
    "        if j > 0:\n",
    "            i = self._parent(j)\n",
    "            if self._data[j] < self._data[i]:\n",
    "                self._swap(i, j)\n",
    "                self._upheap(i)\n",
    "\n",
    "    def _downheap(self, j):\n",
    "        if self._has_left(j):     ## 如果没有子节点，递归就会停止了，这里已经暗含了条件\n",
    "            left = self._left(j)\n",
    "            small_child = left\n",
    "            if self._has_right(j):\n",
    "                right = self._right(j)\n",
    "                if self._data[right] < self._data[left]:\n",
    "                    small_child = right\n",
    "            if self._data[small_child] < self._data[j]:\n",
    "                self._swap(j, small_child)\n",
    "                self._downheap(small_child)\n",
    "\n",
    "    def __len__(self):\n",
    "        return len(self._data)\n",
    "\n",
    "    def add(self, key, value):\n",
    "        self._data.append(self._Item(key, value))\n",
    "        self._upheap(len(self._data) - 1)\n",
    "\n",
    "    def min(self):\n",
    "        if self.is_empty():\n",
    "            raise Empty('Priority queue is empty')\n",
    "        min_Item = self._data[0]\n",
    "        return min_Item._key, min_Item._value\n",
    "\n",
    "    def remove_min(self):\n",
    "        if self.is_empty():\n",
    "            raise Empty('Priority queue is empty')\n",
    "        min_Item = self._data[0]\n",
    "        self._data[0] = self._data[len(self._data) - 1]\n",
    "        del self._data[len(self._data) - 1]\n",
    "        self._downheap(0)\n",
    "        return min_Item._key, min_Item._value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "自底向上堆构建的渐进分析：自底向上构建堆比从空堆开始add的运行时间更短。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "从空堆开始add的时间复杂度为O(nlog(n))，分析的方法是，只考虑底层的叶子节点，大致有n/2个，每个向上冒泡需要log(n)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "自底向上构建堆之所以会快，是因为冒泡需要多层的节点是在堆上部，而堆上部节点少，如果不断add，底层的节点是最多的，冒泡也是最慢的。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "自底向上构建堆的时间复杂度分析：倒数第二层最多向下冒泡一次，倒数第三层最多向下冒泡两次...全部求和近似为O(n)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$S_h = 2^{h-1} + 2^{h-2} \\times 2 + ... + 2 \\times (h-1) + h = 2^h-1-h$（错位相减）（假设每一层都是满的）（相对于n为O(n)）"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.3.7 Python的heapq模块"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "该模块提供基于堆的优先级队列的支持，但不直接提供优先级队列类，而是提供一些函数，将list作为堆来管理。这里可以直观地看出类和函数的区别，类将数据和函数整合成对象，含属性和方法，而函数需要另外输入数据，相当于类属性和方法的分离。除了这里，对比r语言和python的数据框概念也有助于我们理解类。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Heapq模块支持的函数（假设list已经满足heap-order属性）：\n",
    "\n",
    "1. heappush(L, e): 相当于add，时间复杂度O(log(n))\n",
    "2. heappop(L): 相当于remove_min，时间复杂度O(log(n))\n",
    "3. heappushpop(L, e): 1和2的结合，放入元素后，取出并返回优先级最高的键值对元组，效率相比1和2组合要更高，因为如果插入的最小，函数将直接返回，如果不是最小，将插入到堆顶然后向下冒泡。\n",
    "4. heapreplace(L, e): 与3类似，只不过是在放入元素之前执行pop。\n",
    "\n",
    "对于不满足heap-order的list的方法：\n",
    "\n",
    "1. heaprify(L): 让未排序的list满足heap-order（使用的是自底向上构建堆的方法）,时间复杂度O(n)\n",
    "2. nlargest(k, iterable): 生成前k大的值的list，时间复杂度O(n + klog(n))。（应该是先构建堆，然后remove_min执行k次，需要注意堆的heap-order的定义）\n",
    "3. nlargest(k, iterable): 与2同理"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 9.4 使用优先级队列排序"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "使用优先级队列进行排序的方法很简单，将元素构建成一个优先级队列，然后依次remove_min即可。具体的函数不再写了。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.4.1 选择排序和插入排序"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "使用优先级队列进行排序的时间复杂度取决于add和remove_min方法的时间复杂度。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "选择排序实际上来源于使用未排序列表实现的优先级队列来实现排序，未排序列表先将所有元素都add，耗时O(n)，然后依次选择min，耗时O($n^2$)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "插入排序实际上来源于使用已排序列表实现的优先级队列来实现排序，已排序列表将所有元素add的过程体现了`插入`，最后remove_min总时间只需要\n",
    "O(n)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "以上两种算法的时间复杂度都为O($n^2$)，其中选择排序每次remove_min需要的时间是遍历整个当前列表，没有最坏情况这回事，而插入排序插入的位置可能在第一个就插入了，不要遍历所有已排序部分，因此插入排序是在最坏的情况下达到O($n^2$)，选择排序是必须O($n^2$)，因此在某种意义上插入排序比选择排序更有效率。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.4.2 堆排序"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "前面用排序和未排序列表实现优先级队列，综合考虑add和remove_min的时间复杂度，是不如堆实现的优先级队列的。因此用堆实现的优先级队列来实现排序，效率可能更高，这就是`堆排序`。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "堆排序的效率都差不多，不管是自下而上还是自上而下构建堆，因为这两种构建堆的方式区别在于add方法的效率，但是受到之后remove_min方法的影响，所以最终的时间复杂度都差不多。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "构建堆的时间复杂度：O(n)——自下而上，O(nlog(n))——自上而下。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "remove_min的时间复杂度（连续n次）：O(nlog(n))。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如果元素使用list存储的，正常的思路是将当前list转化为堆，并建立一个新的list存储每次remove_min返回的元素。这里一个更好的方法是引入一个i，记录一个list中为堆的部分，一开始整个list都不是堆，堆只是个空堆，然后通过i从左向右移动，整个list都变成了堆（这其实是不断将i+1处元素add的过程），然后不断remove_min，将remove的元素插入到n-i的部分，i从右向左移动，直到整个list都不再是堆为止，就实现了排序，这样空间复杂度降到了最低，称为`原地堆排序`。（`原地算法`一般指的是除了存储原始的元素列表之外只使用少量的内存）"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 原地堆排序一个奇怪的地方是，每次remove_min的时候都需要改变元素的位置，这样不会导致O($n^2$)的时间复杂度吗？"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 9.5 适应性优先级队列"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "有时候一些元素失去资格将会从优先级队列中移除，显然优先级队列需要一个remove_min之外的remove方法；有些元素的优先级会发生变化，因此需要一个update方法来更新优先级队列的优先级。这些功能需要重新编写一个适应性优先级队列。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.5.1 定位器"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "为了移除或者更新特定的元组，需要一个定位器来定位特定的元素，并避免在整个优先级队列中进行搜索。这种定位器在add一个元组时会返回，类似于位置列表的位置。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 9.5.2 适应性优先级队列的实现"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在堆实现的优先级队列中，底层存储的数据结构是list，因此定位器相当于在list中的索引，因此可以将元组进行扩展，新增一维存储索引。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "一个键的update操作需要简单的一次堆向上冒泡或者向下冒泡来重新满足heap-ordershuxing；一次remove操作需要将最后位置的元素移动remove位置，执行冒泡（跟remove_min差不多的流程）"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "class AdaptableHeapPriorityQueue(HeapPriorityQueue):\n",
    "    \n",
    "    class Locator(HeapPriorityQueue._Item):\n",
    "        \n",
    "        __slots__ = '_index'\n",
    "        \n",
    "        def __init__(self, k, v, j):\n",
    "            super().__init__(k, v)\n",
    "            self._index = j\n",
    "    \n",
    "    def _swap(self, i, j):\n",
    "        super()._swap(i, j)\n",
    "        self._data[i]._index = i               ## index是随着位置变的，需要自己更新\n",
    "        self._data[j]._index = j\n",
    "    \n",
    "    def _bubble(self, j):\n",
    "        if j > 0 and self._data[j] < self._data[self._parent(j)]:\n",
    "            self._upheap(j)\n",
    "        else:\n",
    "            self._downheap(j)\n",
    "    \n",
    "    def add(self, key, value):\n",
    "        token = self.Locator(key, value, len(self._data))\n",
    "        self._data.append(token)\n",
    "        self._upheap(len(self._data) - 1)\n",
    "        return token\n",
    "    \n",
    "    def update(self, loc, newkey, newval):\n",
    "        j = loc._index\n",
    "        if not (0 <= j < len(self) and self._data[j] is loc):\n",
    "            raise ValueError('Invalid locator')\n",
    "        loc._key = newkey\n",
    "        loc._value = newval\n",
    "        self._bubble(j)\n",
    "    \n",
    "    def remove(self, loc):\n",
    "        j = loc._index\n",
    "        if not (0 <= j < len(self) and self._data[j] is loc):\n",
    "            raise ValueError('Invalid locator')\n",
    "        if j == len(self) - 1:\n",
    "            self.pop()\n",
    "        else:\n",
    "            self._swap(j, len(self) - 1)\n",
    "            self._data.pop()\n",
    "            self._bubble(j)\n",
    "        return (loc._key, loc._value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "操作 | 运行时间\n",
    ":-: | :-:\n",
    "len(P), P.is_empty(), P.min() | O(1)\n",
    "P.add(k, v) | O(log(n))\\*\n",
    "P.update(loc, k, v) | O(log(n))\n",
    "P.remove(loc) | O(log(n))\\*\n",
    "P.remove_min() | O(log(n))\\*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 9.6 练习"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "R-9.18"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "堆排序时间复杂度：连续n次remove_min，可以只看叶子节点的remove_min，如果最后一层只有一个节点，那么最初的堆的所有叶子节点（(n+1)/2个）都执行下堆操作，需要((n-1)/2)\\*(log(n)-1)+log(n)，为O(nlog(n))，如果最后一层是满的，那么为((n+1)/2)\\*log(n)，也是O(nlog(n))。而除叶子节点之外的加起来不可能超过叶子节点的下堆时间。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "C-9.26、27"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "优先级队列实现栈：越晚push的优先级越高。\n",
    "\n",
    "优先级队列实现队列：越晚enqueue的优先级越低。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "C-9.29"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "用Python的list实现已排序优先级队列。由于pop(0)效率很低，因此将优先级最高的放在list的末尾。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "class SortedPriorityQueue(PriorityQueueBase):\n",
    "    \n",
    "    def __init__(self):\n",
    "        self._data = []\n",
    "    \n",
    "    def __len__(self):\n",
    "        return len(self._data)\n",
    "    \n",
    "    def add(self, key, value):\n",
    "        new_Item = self._Item(key, value)\n",
    "        i = 0\n",
    "        while self._data[i] > new_Item and i < len(self._data):\n",
    "            if i == len(self._data):\n",
    "                self._data.append(new_Item)\n",
    "            else:\n",
    "                self._data.insert(i, new_Item)\n",
    "    \n",
    "    def min(self):\n",
    "        if not self._data:\n",
    "            raise Empty('Priority queue is empty')\n",
    "        min_element = self._data[len(self._data) - 1]\n",
    "        return min_element._key, min_element._value\n",
    "    \n",
    "    def remove_min(self):\n",
    "        if not self._data:\n",
    "            raise Empty('Priority queue is empty')\n",
    "        min_element = self._data.pop()\n",
    "        return min_element._key, min_element._value"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.2"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
